#Create simulated training data for our SVM exercise
set.seed(1)
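One way to simulate such data is to draw two Gaussian predictors and shift one class so the classes are roughly linearly separable. This is a sketch, not necessarily the original generating process; `sim_data`, `x1`, `x2`, and `y` are placeholder names:

```r
library(tidymodels)

set.seed(1)  # same seed as above
sim_data <- tibble(
  x1 = rnorm(40),
  x2 = rnorm(40),
  y  = factor(rep(c("A", "B"), each = 20))
) |>
  # shift class B so the two classes are (mostly) linearly separable
  mutate(
    x1 = ifelse(y == "B", x1 + 1.5, x1),
    x2 = ifelse(y == "B", x2 + 1.5, x2)
  )
```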
#plot to see the structure of the data we created
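Assuming the simulated data sits in a tibble `sim_data` with predictors `x1`, `x2` and a class factor `y` (placeholder names), a quick scatterplot shows the structure:

```r
library(ggplot2)

ggplot(sim_data, aes(x1, x2, color = y)) +
  geom_point() +
  labs(title = "Simulated training data")
```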
#specify a recipe where we center to mean of 0 and scale to sd of 1
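A sketch of that recipe with {recipes}, where `step_normalize()` does exactly this centering and scaling (`svm_rec` and `sim_data` are placeholder names):

```r
library(recipes)

svm_rec <- recipe(y ~ ., data = sim_data) |>
  step_normalize(all_numeric_predictors())  # center to mean 0, scale to sd 1
```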
#Create linear SVM model specification
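With {parsnip}, a linear SVM specification using the kernlab engine might look like this (`cost = 1` is just an arbitrary starting value):

```r
library(parsnip)

svm_linear_spec <- svm_linear(cost = 1) |>
  set_mode("classification") |>
  set_engine("kernlab")
```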

In SVMs, the cost parameter C influences the width of the margin around the separating hyperplane. A smaller C allows a wider margin and tolerates more misclassifications; recall that accepting more errors on the training set can improve generalization. A larger C aims for a narrower margin that tries to classify as many training samples correctly as possible, even if that means a more complex model.
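To make the trade-off concrete, we could sketch two specifications that differ only in cost (the values 0.1 and 10 are arbitrary illustrations):

```r
library(parsnip)

# small cost: wide, forgiving margin; more training errors allowed
svm_soft_spec <- svm_linear(cost = 0.1) |>
  set_mode("classification") |>
  set_engine("kernlab")

# large cost: narrow margin that tries hard to fit every training point
svm_hard_spec <- svm_linear(cost = 10) |>
  set_mode("classification") |>
  set_engine("kernlab")
```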

#Bundle into workflow
#Fit workflow
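A sketch of both steps, assuming the recipe and model specification from earlier are named `svm_rec` and `svm_linear_spec` and the training data is `sim_data` (all placeholder names):

```r
library(workflows)

svm_wf <- workflow() |>
  add_recipe(svm_rec) |>
  add_model(svm_linear_spec)

svm_fit <- fit(svm_wf, data = sim_data)
svm_fit
```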
#Plot the fit from kernlab engine
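kernlab ships a base-graphics `plot()` method for two-predictor classification fits, and we can reach the underlying engine object with `extract_fit_engine()` (assuming the fitted workflow is named `svm_fit`):

```r
svm_fit |>
  extract_fit_engine() |>
  plot()  # decision boundary, margin, and support vectors
```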
#As usual we want to tune our hyperparameter values
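One way to tune cost with {tune} over cross-validation folds; the number of folds, the grid size, the seed, and the object names are all placeholder choices (assumes `svm_rec` and `sim_data` from earlier):

```r
library(tidymodels)

svm_tune_spec <- svm_linear(cost = tune()) |>
  set_mode("classification") |>
  set_engine("kernlab")

svm_tune_wf <- workflow() |>
  add_recipe(svm_rec) |>
  add_model(svm_tune_spec)

set.seed(1234)
folds <- vfold_cv(sim_data, v = 5, strata = y)

tune_res <- tune_grid(
  svm_tune_wf,
  resamples = folds,
  grid = grid_regular(cost(), levels = 10)
)

autoplot(tune_res)  # performance across candidate cost values
```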

#Finalize model and fit
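A sketch, assuming tuning results `tune_res`, the tunable workflow `svm_tune_wf`, and training data `sim_data` (placeholder names):

```r
best_cost <- select_best(tune_res, metric = "accuracy")

final_fit <- svm_tune_wf |>
  finalize_workflow(best_cost) |>
  fit(data = sim_data)
```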

#Create a small test data set
set.seed(2)
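The test set can be drawn from the same process as the training data; again, the generating process and the names are assumptions:

```r
library(tidymodels)

set.seed(2)  # same seed as above
test_data <- tibble(
  x1 = rnorm(20),
  x2 = rnorm(20),
  y  = factor(rep(c("A", "B"), each = 10))
) |>
  mutate(
    x1 = ifelse(y == "B", x1 + 1.5, x1),
    x2 = ifelse(y == "B", x2 + 1.5, x2)
  )
```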

We can use augment() from {broom} to predict on new data (the test data) with our trained model and append additional columns useful for examining model performance.
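For example, where `final_fit` and `test_data` are placeholder names for the fitted workflow and the test set, and the class levels `A`/`B` are assumed:

```r
aug <- augment(final_fit, new_data = test_data)

# the test data plus .pred_class and class-probability columns
aug |> select(y, .pred_class, .pred_A, .pred_B)

accuracy(aug, truth = y, estimate = .pred_class)
```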

That went well, but what makes SVMs really interesting is that we can use non-linear kernels. Let us start by generating some data again, but this time with a non-linear class boundary.
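A common sketch is a radial class boundary (the exact process and the names `sim_data2`, `svm_rbf_spec` are assumptions), paired with an RBF-kernel specification from {parsnip}:

```r
library(tidymodels)

set.seed(1)
sim_data2 <- tibble(
  x1 = rnorm(200),
  x2 = rnorm(200)
) |>
  # class depends on distance from the origin: a circular boundary
  mutate(y = factor(ifelse(x1^2 + x2^2 > 1.5, "A", "B")))

svm_rbf_spec <- svm_rbf() |>
  set_mode("classification") |>
  set_engine("kernlab")
```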

#Fit the new specification
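A sketch, assuming the RBF specification and the non-linear data are named `svm_rbf_spec` and `sim_data2` (placeholder names):

```r
svm_rbf_fit <- svm_rbf_spec |>
  fit(y ~ ., data = sim_data2)
```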
#Plot the fit
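As before, kernlab's `plot()` method can visualize the fitted decision boundary (`svm_rbf_fit` is a placeholder name):

```r
svm_rbf_fit |>
  extract_fit_engine() |>
  plot()  # non-linear boundary around the inner class
```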
#Create the test data
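Drawn from the same non-linear process (names and seed are assumptions):

```r
library(tidymodels)

set.seed(2)
test_data2 <- tibble(
  x1 = rnorm(100),
  x2 = rnorm(100)
) |>
  mutate(y = factor(ifelse(x1^2 + x2^2 > 1.5, "A", "B")))
```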
#Examine model performance via confusion matrix
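With {yardstick}'s `conf_mat()`, assuming a fitted model `svm_rbf_fit` and test set `test_data2` (placeholder names):

```r
aug2 <- augment(svm_rbf_fit, new_data = test_data2)

aug2 |>
  conf_mat(truth = y, estimate = .pred_class)
```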

ROC Curves

#We can examine our model's performance using ROC and AUC
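A sketch with {yardstick}'s `roc_curve()` and `roc_auc()`, assuming augmented predictions `aug2` with a probability column `.pred_A` for the first class level (placeholder names):

```r
aug2 |>
  roc_curve(truth = y, .pred_A) |>
  autoplot()

aug2 |>
  roc_auc(truth = y, .pred_A)
```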